Supervised classification of news articles by whether they mention diseases and outbreaks

Author

  • Mason Chua
Abstract

Global Viral Forecasting (GVF) is working to use data extracted from human-readable web documents to predict when and where disease outbreaks will happen. As part of this effort, documents must be classified by whether they are relevant to this prediction process. To that end, GVF has collected 75,176 manual annotations of web documents. These annotations include which of the following mutually exclusive classes the document falls into:


Similar resources

Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Beyond its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...


Lost in Space: Geolocation in Event Data

Extracting the “correct” location information from text data, i.e., determining the place of event, has long been a goal for automated text processing. To approximate human-like coding schema, we introduce a supervised machine learning algorithm that classifies each location word to be either correct or incorrect. We use news articles collected from around the world (Integrated Crisis Early War...


Finding Bias in Political News and Blog Websites

News and blog websites often have political bias (such as Republican, Democratic) in their articles. Automatic detection of the bias will improve personalized feed and categorization of news and blog articles. Our project aims to predict Republican vs. Democratic bias of news websites and political blogs using the phrases (a.k.a. memes) they quote in their text. We form a bipartite graph of web...


Corporate News Classification and Valence Prediction: A Supervised Approach

News articles have always been a prominent force in the formation of a company’s financial image in the minds of the general public, especially the investors. Given the large amount of news being generated these days through various websites, it is possible to mine the general sentiment of a particular company being portrayed by media agencies over a period of time, which can be utilized to gau...


Temporal Topic Modeling to Assess Associations between News Trends and Infectious Disease Outbreaks

In retrospective assessments, internet news reports have been shown to capture early reports of unknown infectious disease transmission prior to official laboratory confirmation. In general, media interest and reporting peak and wane during the course of an outbreak. In this study, we quantify the extent to which media interest during infectious disease outbreaks is indicative of trends of re...




Publication date: 2011